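This post links related features together in simple ways to produce new ones. Creating new features matters when you want to capture the combined effect of several factors: for example, a single feature for "bought both a and b" does not mean the same thing as two separate features, "bought a" and "bought b".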
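As a minimal sketch of that distinction, the snippet below builds a joint feature from two separate ones. The `purchases` table and its column names are hypothetical and are not part of the housing data used in this post.

```python
import pandas as pd

# Hypothetical purchase records: 1 means the customer bought the item.
purchases = pd.DataFrame({
    "bought_a": [1, 1, 0, 0],
    "bought_b": [1, 0, 1, 0],
})

# Joint feature: 1 only when a and b were bought together.
purchases["bought_a_and_b"] = purchases["bought_a"] * purchases["bought_b"]

print(purchases)
```

Only the first row gets a 1 in `bought_a_and_b`, which is information that neither of the two separate columns states directly on its own.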
features = features.drop(['Utilities', 'Street', 'PoolQC'], axis=1)
# Build new features based on the text descriptions of the other features
features['YrBltAndRemod'] = features['YearBuilt'] + features['YearRemodAdd']
features['TotalSF'] = features['TotalBsmtSF'] + features['1stFlrSF'] + features['2ndFlrSF']
features['Total_sqr_footage'] = (features['BsmtFinSF1'] + features['BsmtFinSF2'] +
                                 features['1stFlrSF'] + features['2ndFlrSF'])
features['Total_Bathrooms'] = (features['FullBath'] + (0.5 * features['HalfBath']) +
                               features['BsmtFullBath'] + (0.5 * features['BsmtHalfBath']))
features['Total_porch_sf'] = (features['OpenPorchSF'] + features['3SsnPorch'] +
                              features['EnclosedPorch'] + features['ScreenPorch'] +
                              features['WoodDeckSF'])
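To make the half-bath weighting concrete, here is a small sketch on made-up rows. The `toy` frame and its values are assumptions for illustration only; the column names match the ones used above.

```python
import pandas as pd

# Made-up rows for illustration; not taken from the real dataset.
toy = pd.DataFrame({
    "FullBath": [2, 1],
    "HalfBath": [1, 0],
    "BsmtFullBath": [0, 1],
    "BsmtHalfBath": [1, 0],
})

# Full baths count as 1 each, half baths as 0.5 each.
toy["Total_Bathrooms"] = (toy["FullBath"] + 0.5 * toy["HalfBath"]
                          + toy["BsmtFullBath"] + 0.5 * toy["BsmtHalfBath"])

print(toy["Total_Bathrooms"].tolist())  # [3.0, 2.0]
```

The first house has two full baths and two half baths, so it scores 2 + 0.5 + 0 + 0.5 = 3.0.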
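Some variables record the area of a pool, a garage, and so on, but nothing tells the model explicitly whether the house has that amenity at all. We therefore add binary features such as "has pool" and "has garage".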
features['haspool'] = features['PoolArea'].apply(lambda x: 1 if x > 0 else 0)
features['has2ndfloor'] = features['2ndFlrSF'].apply(lambda x: 1 if x > 0 else 0)
features['hasgarage'] = features['GarageArea'].apply(lambda x: 1 if x > 0 else 0)
features['hasbsmt'] = features['TotalBsmtSF'].apply(lambda x: 1 if x > 0 else 0)
features['hasfireplace'] = features['Fireplaces'].apply(lambda x: 1 if x > 0 else 0)
features.shape
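The flags above are built with a row-wise `apply`. One possible alternative, sketched here on a made-up `toy` frame (the values are assumptions, not real data), is a vectorized comparison followed by `astype(int)`, which should give the same 0/1 values.

```python
import pandas as pd

# Made-up areas for illustration; 0 means the amenity is absent.
toy = pd.DataFrame({
    "PoolArea": [0, 512, 0],
    "GarageArea": [460, 0, 300],
})

# Vectorized flags: compare the whole column at once, then cast bool to int.
toy["haspool"] = (toy["PoolArea"] > 0).astype(int)
toy["hasgarage"] = (toy["GarageArea"] > 0).astype(int)

# Row-wise version, as used in the post, for comparison.
via_apply = toy["PoolArea"].apply(lambda x: 1 if x > 0 else 0)
same = bool((toy["haspool"] == via_apply).all())

print(toy)
print(same)
```

Both forms treat missing values the same way, since `NaN > 0` evaluates to false in either case; the vectorized form mainly avoids a Python-level call per row.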